<div> K-sparse encoder for efficient information retrieval</div> Open database of scientific publications ITMO UNIVERSITY

K-sparse encoder for efficient information retrieval

Journal

Scientific and Technical Journal of Information Technologies, Mechanics and Optics

Dobrynin Viacheslav Yu.

UDK004.89

Issue:4 (162)

Download PDF0 Kbyte

Annotation

Modern industrial search engines typically employ a two-stage pipeline: fast candidate retrieval followed by reranking. This approach inevitably leads to the loss of some relevant documents due to the simplicity of algorithms used in the first stage. This work proposes a single-stage approach that combines the advantages of dense semantic search models with the efficiency of inverted indices. The key component of the solution is a K-sparse encoder used to convert dense vectors into sparse ones compatible with inverted indices of the Lucene library. In contrast to the previously studied identifiable variational autoencoder, the proposed model is based on an autoencoder with a TopK activation function which explicitly enforces a fixed number of non-zero coordinates during training. This activation function makes the sparse vector generation process differentiable, eliminates the need for post-processing, and simplifies the loss function to a sum of reconstruction error and a component preserving relative distances between dense and sparse representations. The model was trained on a 300,000-document subset of the MS MARCO dataset using PyTorch and an NVIDIA L4 GPU. The proposed model achieves 96.6 % of the quality of the original dense model in terms of the NDCG@10 metric (0.57 vs. 0.59) on the SciFact dataset with 80 % sparsity. It is also shown that further increasing sparsity reduces index size and improves retrieval speed while maintaining acceptable search quality. In terms of memory usage, the approach outperforms the Hierarchical Navigable Small World (HNSW) graph-based algorithm, and at high sparsity levels, its speed approaches that of HNSW. The results confirm the applicability of the proposed approach to unstructured data retrieval. Direct control over sparsity enables balancing between search quality, latency, and memory requirements. Thanks to the use of an inverted index based on the Lucene library, the proposed solution is well suited for industrial- scale search systems. Future research directions include interpretability of the extracted features and improving retrieval quality under high sparsity conditions.

K-sparse encoder for efficient information retrieval

Scientific and Technical Journal of Information Technologies, Mechanics and Optics

Annotation

Keywords

Постоянный URL

Articles in current issue

K-sparse encoder for efficient information retrieval

Scientific and Technical Journal of Information Technologies, Mechanics and Optics

Annotation

Keywords

Постоянный URL

Поделиться

Articles in current issue